Comparing Distributional and Mirror Translation Similarities for Extracting Synonyms

Authors

  • Philippe Muller
  • Philippe Langlais
Abstract

Automated thesaurus construction by collecting relations between lexical items (synonyms, antonyms, etc.) has a long tradition in natural language processing. This has been done by exploiting dictionary structures or distributional context regularities (co-occurrence, syntactic associations, or translation equivalents) in order to define measures of lexical similarity or relatedness. Dyvik proposed using aligned multilingual corpora and defined similar terms as terms that often share their translations. We evaluate the usefulness of this similarity for the extraction of synonyms, compared to the more widespread distributional approach.
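The idea of treating terms as similar when they share translations can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not state which similarity measure is used, so cosine over translation-count vectors is a stand-in, and the function names and the toy English-French alignment pairs are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def translation_vectors(aligned_pairs):
    """Map each source word to a Counter of the target words aligned with it."""
    vectors = defaultdict(Counter)
    for source_word, target_word in aligned_pairs:
        vectors[source_word][target_word] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(word, vectors):
    """Rank all other source words by how much their translations overlap with `word`."""
    target = vectors.get(word, Counter())
    scores = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical word-aligned pairs (source word, aligned translation).
TOY_PAIRS = [
    ("big", "grand"), ("big", "grand"), ("big", "gros"),
    ("large", "grand"), ("large", "gros"), ("large", "vaste"),
    ("small", "petit"), ("small", "petit"),
]

if __name__ == "__main__":
    vectors = translation_vectors(TOY_PAIRS)
    for candidate, score in rank_candidates("big", vectors):
        print(f"{candidate}\t{score:.3f}")
```

In this toy data, "large" shares two translations with "big" and so ranks above "small", which shares none. A distributional baseline could reuse the same `cosine` function by replacing the translation counts with monolingual context counts, which is one way the two kinds of similarity could be compared on equal footing.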


Similar resources

Synonym extraction and abbreviation expansion with ensembles of semantic spaces

BACKGROUND: Terminologies that account for variation in language use by linking synonyms and abbreviations to their corresponding concept are important enablers of high-quality information extraction from medical texts. Due to the use of specialized sub-languages in the medical domain, manual construction of semantic resources that accurately reflect language use is both costly and challenging, ...


Integrating Distributional Lexical Contrast into Word Embeddings for Antonym-Synonym Distinction

We propose a novel vector representation that integrates lexical contrast into distributional vectors and strengthens the most salient features for determining degrees of word similarity. The improved vectors significantly outperform standard models and distinguish antonyms from synonyms with an average precision of 0.66–0.76 across word classes (adjectives, nouns, verbs). Moreover, we integrat...


Leveraging Paraphrase Labels to Extract Synonyms from Twitter

We present an approach for automatically learning synonyms from a corpus of paraphrased tweets. The synonyms are learned by using shallow parse chunks to create candidate synonyms and their context windows, and the synonyms are substituted back into a paraphrase detection system that uses machine translation metrics as features for a classifier. We find a 2.29% improvement in F1 when we train a...


Distributional Similarity of Multi-Word Expressions

Most existing systems for automatically extracting lexical-semantic resources neglect multi-word expressions (MWEs), even though approximately 30% of gold-standard thesauri entries are MWEs. We present a distributional similarity system that identifies synonyms for MWEs. We extend Grefenstette’s SEXTANT shallow parser to first identify bigram MWEs using collocation statistics from the Google WE...


Extracting Synonyms from Dictionary Definitions

Automatic extraction of synonyms and/or semantically related words has various applications in Natural Language Processing (NLP). There are currently two mainstream extraction paradigms, namely, lexicon-based and distributional approaches. The former usually suffers from low coverage, while the latter is only able to capture general relatedness rather than strict synonymy. In this paper, two ru...




Publication date: 2011